VAST Challenge 2017 MC1
Master in Data Mining and Knowledge Discovery
Universidad de Buenos Aires
Student Team: Blanco, Daniela; Cabrera, Omar; Centurión, Emmanuel; Demarchi, Agustín; Garber, Leandro; Liberman, Gaston; Oyola, Diego; Sokil, Juan Pablo
This challenge asks for different types of patterns: daily and multiple-day.
Our approach to finding patterns was based on the number of days that vehicles stayed in the park, so we split the dataset in two: vehicles that entered and exited the park on the same day, and vehicles that entered and exited on different days. For example, a vehicle that entered at 11 pm and left at 2 am the next day or later counts as part of a multiple-day pattern.
Using that criterion we identified 19,489 vehicles that entered the park within one year (from May 2015 to May 2016), with the following type and monthly distribution.
After this division we found that the patterns differed markedly across the months of the year, which suggests seasonal behavior. Winter had the fewest entries and summer the most.
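The split described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual code: the trip tuples, the car IDs and the `trip_kind` helper are hypothetical, and the timestamps are made up.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trips: (car_id, entry time, exit time). All values are made up.
trips = [
    ("car-a", datetime(2015, 7, 4, 9, 0), datetime(2015, 7, 4, 17, 30)),
    ("car-b", datetime(2015, 7, 4, 23, 0), datetime(2015, 7, 5, 2, 0)),
    ("car-c", datetime(2016, 1, 10, 8, 0), datetime(2016, 1, 10, 8, 40)),
]

def trip_kind(entry, exit_):
    """Return 'daily' if entry and exit share a calendar date, else 'multi-day'."""
    return "daily" if entry.date() == exit_.date() else "multi-day"

# Label every trip, then count trips per (kind, year, month of entry)
# so that seasonal differences become visible.
kinds = {car: trip_kind(entry, exit_) for car, entry, exit_ in trips}
monthly = Counter(
    (trip_kind(entry, exit_), entry.year, entry.month)
    for _, entry, exit_ in trips
)
```

Note that a vehicle entering at 11 pm and leaving at 2 am is labelled multi-day even though it stayed only three hours, which matches the criterion stated above.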
Our first approach to analyzing trajectories used Gephi to build a directed graph that shows the similarities and differences between daily trajectories (left image) and multiple-day trajectories (right image).
The graph shows the connections between the different points in the park (gates, campsites, entrances). The thicker the edge, the more vehicles took that path.
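As a rough sketch of how such edge weights could be derived, the snippet below counts how many times each consecutive pair of sensors appears across a set of trajectories. The trajectories are made-up examples and `edge_weights` is a hypothetical helper. The `Source`, `Target`, `Weight` header is the column layout Gephi accepts for an edge-list import.

```python
from collections import Counter

# Hypothetical trajectories, each an ordered list of sensor names.
trajectories = [
    ["entrance1", "general-gate7", "entrance3"],
    ["entrance1", "general-gate7", "entrance3"],
    ["entrance2", "general-gate3", "camping8", "camping8",
     "general-gate3", "entrance2"],
]

def edge_weights(paths):
    """Count traversals of every directed edge (src -> dst)."""
    weights = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            if src != dst:  # ignore self-loops such as camping8 -> camping8
                weights[(src, dst)] += 1
    return weights

weights = edge_weights(trajectories)

# Rows ready to be written as an edge-list CSV.
rows = [("Source", "Target", "Weight")] + [
    (src, dst, n) for (src, dst), n in sorted(weights.items())
]
```

Dropping self-loops is a design choice for the drawing only; the repeated readings at a campsite are still useful for measuring stays.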
Another approach to trajectories was the sunburst plot (MS Excel), which shows the proportion of point-to-point routes and how they differ between daily vehicles (left) and multiple-day vehicles (right). Colour encodes the starting point of the vehicle, and each subdivision shows the proportion of each ending point of the trajectory.
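The proportions behind a start-point/end-point sunburst can be computed as below. This is a sketch under assumptions: `end_shares` is a hypothetical helper and the sample trajectories are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical trajectories, each an ordered list of sensor names.
sample_paths = [
    ["entrance1", "general-gate7", "entrance3"],
    ["entrance1", "general-gate7", "entrance3"],
    ["entrance1", "camping4", "camping4", "entrance1"],
    ["entrance2", "general-gate3", "camping8", "general-gate3", "entrance2"],
]

def end_shares(paths):
    """For every starting point, return the share of each ending point."""
    counts = defaultdict(Counter)
    for path in paths:
        counts[path[0]][path[-1]] += 1
    return {
        start: {end: n / sum(ends.values()) for end, n in ends.items()}
        for start, ends in counts.items()
    }

shares = end_shares(sample_paths)
```

The outer keys correspond to the coloured inner ring (starting point) and the inner dictionaries to the subdivisions (ending points).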
We also designed a visualization that includes a time reference. We grouped the sensors using hierarchical clustering (R, ggplot library) and found different patterns in each month and season of the year.
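To illustrate the idea of grouping sensors by their activity over time, here is a small pure-Python agglomerative clustering sketch. It is not the R code used by the team: the linkage (single linkage), the Euclidean distance, the `agglomerate` helper and the monthly counts are all assumptions made for the example.

```python
from itertools import combinations
from math import dist

# Hypothetical reading counts per sensor for three months (made-up numbers).
profiles = {
    "camping5": [10, 40, 90],
    "camping8": [12, 45, 95],
    "general-gate7": [200, 210, 205],
    "entrance3": [190, 205, 200],
}

def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering; returns a list of name sets."""
    clusters = [{name} for name in points]
    while len(clusters) > n_clusters:
        # Pick the two clusters whose closest members are nearest to each other.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                dist(points[p], points[q])
                for p in clusters[pair[0]]
                for q in clusters[pair[1]]
            ),
        )
        clusters[a] |= clusters[b]  # merge b into a (a < b always holds)
        del clusters[b]
    return clusters

groups = agglomerate(profiles, 2)
```

With these sample numbers the seasonal campsites end up in one group and the steadily used gate and entrance in the other.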
After that we tried further approaches to uncover patterns: trajectories by vehicle type (Tableau), trajectories by day of the week (Tableau), and a directed graph designed by the team that makes it possible to inspect each trajectory separately (D3).
Let’s see what we found:
Daily patterns
First pattern: Tourist buses that visit the park
Route: Entrance 2 - General Gate 5 - General Gate 2 - Ranger Stop 0 - Ranger Stop 2 - General Gate 1 - Entrance 0.
These are tourist buses (types 5 and 6). Their trip through the reserve takes 39 minutes on average. The park's own vehicles stay inside the park for 204 minutes on average, and the remaining vehicle types behave too variably to infer a frequent pattern.
Second pattern: Trucks crossing the park
Route: Entrance 1 - General Gate 7 - Entrance 3.
These vehicles cross the park without stopping. Their runs are short, 27 minutes on average, and 88% of them are type 2 (trucks with trailers). No specific pattern is observed over time or by season; the flow is steady throughout the year.
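Figures such as the average duration and the share of each vehicle type on a route can be obtained with a simple aggregation. The sketch below assumes trips have already been reduced to (route, vehicle type, duration in minutes); the `route_summary` helper and the sample values are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical trips: (route, car_type, duration in minutes). Values are made up.
route_trips = [
    ("entrance1>general-gate7>entrance3", "2", 25),
    ("entrance1>general-gate7>entrance3", "2", 29),
    ("entrance1>general-gate7>entrance3", "1", 27),
    ("entrance2>general-gate3>camping8", "1", 300),
]

def route_summary(trips, route):
    """Return trip count, mean duration and vehicle-type shares for one route."""
    rows = [(car_type, minutes) for r, car_type, minutes in trips if r == route]
    types = Counter(car_type for car_type, _ in rows)
    return {
        "trips": len(rows),
        "mean_minutes": mean(minutes for _, minutes in rows),
        "type_share": {t: n / len(rows) for t, n in types.items()},
    }

summary = route_summary(route_trips, "entrance1>general-gate7>entrance3")
```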
Third pattern: Camping maintenance
Route: Ranger Stop 5 - Gate 4 - General Gate 3.
Almost all of them (93%) are type 2P vehicles that leave the parking lot and cross restricted areas until they reach Camping 8. They do so in December and January, during working hours (8 am to 8 pm), to carry out maintenance of that campsite.
Fourth pattern: Park extension works
Route: Entrance 1 - Gate 2 - Ranger Stop 1.
96% of them are type 2P vehicles, and they travel between 5 pm and 10 pm.
The vehicles pass Gate 2 and reach Ranger Stop 1 quickly (5 minutes, without stopping). There they stay, coming and going and repeatedly triggering that sensor without reaching any other, for about an hour and a half. They are park employees working on the extension of the venue, probably building a road to improve the connections.
Fifth pattern: Picnic
Route: Entrance 2 - General Gate 3 - Camping 8.
Day campers go to Camping 8, which is the most frequented and the closest campsite. Attendance rises on weekends and grows considerably during the summer months, on weekends and weekdays alike. Most of these vehicles are type 1 cars.
Sixth pattern: Passing vehicles
Route: Entrance 0 - General Gate 4 - General Gate 7 - Entrance 1.
Two behaviors were clearly identified. In the first, vehicles of all types (mostly cars and two-axle trucks) cross from Entrance 0 to Entrance 1, avoiding the camping area. The crossing takes between 30 and 40 minutes: on average 12 minutes each for the first and last legs and 7 minutes for the middle one, so we assume they do not stop and simply use the road as a through route.
The second behavior follows the same road in the opposite direction. It averages 32 minutes and is also a frequent pattern, used by all types of vehicles.
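The per-leg times quoted above come from differences between consecutive sensor timestamps. A minimal sketch follows, with a single made-up trip whose timestamps were chosen to reproduce the reported leg averages (12, 7 and 12 minutes). `leg_minutes` is a hypothetical helper.

```python
from datetime import datetime

# One hypothetical crossing; timestamps are invented to match the averages above.
crossing = [
    ("entrance0", datetime(2015, 6, 1, 10, 0)),
    ("general-gate4", datetime(2015, 6, 1, 10, 12)),
    ("general-gate7", datetime(2015, 6, 1, 10, 19)),
    ("entrance1", datetime(2015, 6, 1, 10, 31)),
]

def leg_minutes(readings):
    """Return (from, to, minutes) for every pair of consecutive readings."""
    return [
        (src, dst, (t1 - t0).total_seconds() / 60)
        for (src, t0), (dst, t1) in zip(readings, readings[1:])
    ]

legs = leg_minutes(crossing)
total = sum(minutes for _, _, minutes in legs)
```

Legs that take roughly the same time for most vehicles suggest that nobody is stopping between the two sensors.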
Multiple day patterns
First pattern: Becoming one with nature
Route: Entrance 2 - General Gate 2 - Ranger Stop 0 - Ranger Stop 2 - General Gate 1 - General Gate 4 - General Gate 7 - Camping 5 - Camping 5 - General Gate 7 - Entrance 3.
These vehicles arrive at Camping 5 on Saturday morning and stay until Sunday noon, crossing the whole park to reach the most remote campsite. Half of them are type 1 vehicles; the rest are split between types 2 and 3.
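In routes like this one, two consecutive readings at the same sensor (Camping 5 - Camping 5) mark the arrival at and departure from a stay. The sketch below extracts such stays and their length. The readings are made up (4 July 2015 was a Saturday) and `stays` is a hypothetical helper.

```python
from datetime import datetime

# Hypothetical, shortened readings for one vehicle: (sensor, timestamp).
weekend_readings = [
    ("entrance2", datetime(2015, 7, 4, 8, 0)),
    ("camping5", datetime(2015, 7, 4, 10, 0)),   # Saturday morning arrival
    ("camping5", datetime(2015, 7, 5, 12, 0)),   # Sunday noon departure
    ("entrance3", datetime(2015, 7, 5, 12, 45)),
]

def stays(readings):
    """Return (sensor, arrival, departure) for consecutive same-sensor readings."""
    return [
        (s0, t0, t1)
        for (s0, t0), (s1, t1) in zip(readings, readings[1:])
        if s0 == s1
    ]

found = stays(weekend_readings)
sensor, arrived, left = found[0]
hours = (left - arrived).total_seconds() / 3600
arrival_weekday = arrived.weekday()  # Monday is 0, so Saturday is 5
```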
Second pattern: Wild weekend
Route: Entrance 1 - Camping 4 - Camping 4 - General Gate 7 - Entrance 3.
These vehicles stay at the campsite closest to Entrance 1. They are probably young people who come to party in the park at night, less interested in nature than in being away from adults.
Third pattern: Camping weekend
Route: Entrance 1 - Camping 2 - Camping 2 - General Gate 7 - Entrance 3.
These vehicles arrive at any time, especially during the summer months. Their stays are short but they make the most of them. They are probably families with kids on a camping weekend.
Fourth pattern: Camping with accommodations
Route: Entrance 0 - General Gate 1 - Ranger Stop 2 - Ranger Stop 0 - General Gate 2 - General Gate 5 - Camping 6 - Camping 6 - General Gate 5 - General Gate 2 - Ranger Stop 0 - Ranger Stop 2 - General Gate 1.
They cross the park and camp close to the ranger base, probably to stay in a common area where they can buy food, drinks and other supplies. Cars can be rented there, so they may leave their own vehicles and use a transportation service to travel within the park.
Fifth pattern: Bird fans
Route: Entrance 3 - General Gate 7 - Camping 5 - Camping 5 - General Gate 7 - Entrance 3.
They go directly to Camping 5, the most remote campsite. It is probably one of the best spots for watching birds in their natural habitat, so they camp there to see the birds at dawn.
Sixth pattern: A week in the forest
Some visitors spend a whole week in the park, mainly on motorbikes and in cars, but they choose different campsites: motorbikes favor Campings 4 and 5, while cars favor Campings 2 and 3.
Unusual patterns
First pattern: 2-axle trucks at dawn in a prohibited zone
On Tuesdays and Thursdays, 2-axle trucks appeared on non-permitted trajectories between 2 am and 5 am, probably to dump waste.
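A filter for this kind of event could look like the sketch below. The set of restricted sensor names, the type code "2" for 2-axle trucks and the `is_suspicious` helper are assumptions made for the example, and the readings are invented.

```python
from datetime import datetime

# Assumed set of sensors that ordinary traffic should not trigger (hypothetical).
RESTRICTED = {"gate2", "gate4"}

def is_suspicious(ts, car_type, sensor, restricted=RESTRICTED):
    """True for a 2-axle truck at a restricted sensor, Tue/Thu, 2 am to 5 am."""
    return (
        car_type == "2"
        and sensor in restricted
        and ts.weekday() in (1, 3)   # Monday is 0: Tuesday = 1, Thursday = 3
        and 2 <= ts.hour < 5
    )

# Made-up readings: (timestamp, car_type, sensor).
night_readings = [
    (datetime(2015, 7, 7, 3, 0), "2", "gate2"),          # Tuesday, 3 am
    (datetime(2015, 7, 7, 15, 0), "2", "gate2"),         # Tuesday, 3 pm
    (datetime(2015, 7, 9, 2, 30), "1", "gate4"),         # Thursday, but a car
    (datetime(2015, 7, 9, 4, 10), "2", "general-gate7"), # not restricted
]

flagged = [r for r in night_readings if is_suspicious(*r)]
```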
Second pattern: Vehicles that stayed in the park for the whole period
Every month they stayed at a different campsite.
Third pattern: Night camping fan
A type 1 vehicle enters through Entrance 0 between 10 pm and 11 pm on Sundays, stays for 2 days at Camping 6, and leaves through Entrance 0. After 5 days it comes back to the park and repeats the same trajectory. The curious part of this case is that the car ID does not change when the vehicle returns to the park.
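One way to spot car IDs that are reused across visits is to count how many entrance readings each ID has: every visit contributes one reading on the way in and one on the way out. This is a sketch under that assumption; `count_visits`, the car IDs and the shortened sensor sequences are hypothetical.

```python
# Hypothetical, shortened sensor sequences per car ID.
cars = {
    "car-x": [
        "entrance0", "camping6", "camping6", "entrance0",  # first visit
        "entrance0", "camping6", "camping6", "entrance0",  # second visit
    ],
    "car-y": ["entrance1", "general-gate7", "entrance3"],
}

def count_visits(sensors):
    """Count visits, assuming each visit has exactly two entrance readings."""
    entrance_hits = sum(1 for name in sensors if name.startswith("entrance"))
    return entrance_hits // 2

visits = {car: count_visits(sensors) for car, sensors in cars.items()}
returning = {car for car, n in visits.items() if n > 1}
```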
Fourth pattern: Weekend camping fan
A type 1 vehicle stays from Friday afternoon to Monday midnight, repeating this sequence for more than 3 months. The first time, it entered through Entrance 1, stayed at Camping 4 for the weekend, and left through Entrance 4. Since then it has entered through Entrance 4, stayed at the same campsite, and left through the same entrance.
Fifth pattern: Group of type 1 vehicles stopping at a non-permitted stop
On 10/07/2015, from 10 a.m., six type 1 vehicles spent the afternoon at the Ranger Stop 1 checkpoint, which is not an allowed stop.
Sixth pattern: Vehicles that avoid the Gate 2 sensor
These vehicles do not pass through the sensor, so they probably cross the park through a non-permitted area. They also share the same route, Entrance 1 - Ranger Stop 0 - Entrance 1, between 7:00 and 13:00 on 20/07/2015.
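A data-driven way to surface vehicles that skip a sensor is to look for sensor-to-sensor transitions that almost never occur in the rest of the data. The sketch below is an assumption about how this could be done, not the team's procedure: `rare_transitions`, its threshold and the sample paths (including the road layout they imply) are invented for illustration.

```python
from collections import Counter

# Made-up trajectories: three pass the gate sensor, one jumps straight across.
observed_paths = [
    ["entrance1", "gate2", "ranger-stop0", "gate2", "entrance1"],
    ["entrance1", "gate2", "ranger-stop0", "gate2", "entrance1"],
    ["entrance1", "gate2", "ranger-stop0", "gate2", "entrance1"],
    ["entrance1", "ranger-stop0", "entrance1"],
]

def rare_transitions(paths, max_count=1):
    """Return directed transitions seen at most max_count times overall."""
    counts = Counter(
        (src, dst) for path in paths for src, dst in zip(path, path[1:])
    )
    return {edge for edge, n in counts.items() if n <= max_count}

rare = rare_transitions(observed_paths)
```

Transitions flagged this way still need to be checked against the park map, since a rarely used but legitimate road would be flagged too.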
Dangerous patterns
Unusual pattern No. 1: Trucks at dawn in a prohibited zone, probably unloading garbage.
Frequent daily pattern No. 6: Passing vehicles. The constant passage of vehicles generates noise pollution that keeps birds away.
Unusual pattern No. 6: Vehicles that bypass the sensor. This indicates that they access protected areas and disturb the birds.